value function approximation
Policy Gradient With Value Function Approximation For Collective Multiagent Planning
Decentralized (PO)MDPs provide an expressive framework for sequential decision making in a multiagent system. Given their computational complexity, recent research has focused on tractable yet practical subclasses of Dec-POMDPs. We address one such subclass, called CDec-POMDP, in which the collective behavior of a population of agents affects the joint reward and the environment dynamics. Our main contribution is an actor-critic (AC) reinforcement learning method for optimizing CDec-POMDP policies. Vanilla AC converges slowly on larger problems. To address this, we show how a particular decomposition of the approximate action-value function over agents leads to effective updates, and we derive a new way to train the critic based on local reward signals. Comparisons on a synthetic benchmark and a real-world taxi fleet optimization problem show that our new AC approach provides better-quality solutions than the previous best approaches.
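The per-agent decomposition described in the abstract can be illustrated with a minimal sketch. Assuming, hypothetically, a tabular local critic `q_w` shared across agents, the joint action value is approximated as a sum of per-agent terms, each trained from that agent's local reward signal. The names and the tabular parameterization here are illustrative only, not the paper's actual method.

```python
# Hedged sketch: a critic decomposed as a sum of per-agent local terms,
#   Q(s, a) ~= sum_m q_w(s_m, a_m),
# with each local term trained from its agent's local reward r_m.
# All names (q_w, joint_obs) are illustrative assumptions.

def local_q(w, s_m, a_m):
    # tabular local critic over a (local state, local action) pair
    return w[(s_m, a_m)]

def critic_update(w, joint_obs, alpha=0.1):
    # joint_obs: list of (s_m, a_m, r_m) tuples, one per agent;
    # each local term regresses toward its agent's local reward
    for s_m, a_m, r_m in joint_obs:
        w[(s_m, a_m)] += alpha * (r_m - w[(s_m, a_m)])
    return w
```

The joint action value is then recovered by summing `local_q` over agents, which is what makes the update cheap relative to a monolithic critic over the joint state-action space.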
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Leisure & Entertainment > Games (0.47)
- Education (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: quality, clarity, originality, and significance.
This is a very well-written paper that explores the use of weighted importance sampling to speed up learning in off-policy LSTD-type algorithms. The theoretical results are solid and what one would expect. The computational results are striking. The technique could serve as a useful component in the design of RL algorithms.
Q2: Please summarize your review in 1-2 sentences.
The paper is very well-written and presents a useful idea validated by striking computational results.
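The weighted importance sampling idea the review refers to can be sketched in its simplest, non-LSTD form: compared with ordinary importance sampling, the weighted estimator normalizes by the sum of the importance ratios, trading a small bias for much lower variance. This is a generic illustration, not the reviewed paper's algorithm.

```python
def ordinary_is(rhos, returns):
    # ordinary importance sampling: unbiased, but high variance
    # when the importance ratios rho_i vary widely
    return sum(r * g for r, g in zip(rhos, returns)) / len(returns)

def weighted_is(rhos, returns):
    # weighted importance sampling: normalizes by the sum of the
    # ratios; slightly biased, but much lower variance
    total = sum(rhos)
    if total == 0:
        return 0.0
    return sum(r * g for r, g in zip(rhos, returns)) / total
```

For example, if every observed return equals 1, the weighted estimator returns exactly 1 regardless of the ratios, while the ordinary estimator can badly over- or under-shoot.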
Fairness in Multi-Agent Sequential Decision-Making
We define a fairness solution criterion for multi-agent decision-making problems in which agents have local interests. This new criterion aims to maximize the worst performance among the agents while also accounting for overall performance. We develop a simple linear programming approach and a more scalable game-theoretic approach for computing an optimal fairness policy. The game-theoretic approach formulates fairness optimization as a two-player zero-sum game and employs an iterative algorithm to find a Nash equilibrium, which corresponds to an optimal fairness policy.
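The iterative zero-sum formulation can be sketched generically. Assuming a hypothetical payoff matrix `payoff[i][j]` giving agent j's performance under pure policy i, a multiplicative-weights update against a best-responding adversary (who always picks the worst-off agent) approximates the maximin mixture over policies. This is a standard no-regret sketch, not the authors' exact algorithm.

```python
def maximin_mixture(payoff, iters=2000, eta=0.1):
    # payoff[i][j]: performance of agent j under pure policy i.
    # The adversary picks the worst-off agent; multiplicative
    # weights update the mixture over policies. Returns the
    # time-averaged mixture, which approximates the maximin policy.
    n, m = len(payoff), len(payoff[0])
    weights = [1.0] * n
    avg = [0.0] * n
    for _ in range(iters):
        total = sum(weights)
        mix = [w / total for w in weights]
        # adversary best response: agent with lowest expected performance
        vals = [sum(mix[i] * payoff[i][j] for i in range(n)) for j in range(m)]
        j = min(range(m), key=lambda k: vals[k])
        # reward each pure policy by its payoff to the worst-off agent
        for i in range(n):
            weights[i] *= 1.0 + eta * payoff[i][j]
            avg[i] += mix[i]
    return [a / iters for a in avg]
```

On a 2x2 identity payoff (policy 0 helps only agent 0, policy 1 helps only agent 1) the fair mixture splits roughly evenly between the two policies.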
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > Slovenia (0.04)
- Information Technology > Game Theory (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.49)
The paper presents an interesting idea: using a robust formulation to fit a value function given aggregate states. The robust formulation leads to a much stronger error bound than what is achieved by regular approximate value iteration. The key is using the algorithm to effectively select the weights applied to states within each aggregate. On the negative side, the authors do not do a good job of presenting their notation and algorithm clearly, and the computational results are difficult to understand.
Basis refinement strategies for linear value function approximation in MDPs
Gheorghe Comanici, Doina Precup, Prakash Panangaden
We provide a theoretical framework for analyzing basis function construction for linear value function approximation in Markov Decision Processes (MDPs). We show that important existing methods, such as Krylov bases and Bellman-error-based methods, are special cases of the general framework we develop. We provide a general algorithmic framework for computing basis function refinements that "respect" the dynamics of the environment, and we derive approximation error bounds that apply to any algorithm respecting this general framework. We also show how, using ideas related to bisimulation metrics, one can translate basis refinement into a process of finding "prototypes" that are diverse enough to represent the given MDP.
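One of the existing methods the abstract names as a special case, Krylov bases, is simple enough to sketch directly: the basis stacks the vectors r, Pr, P²r, … as features for linear value function approximation. Here `P` (transition matrix) and `r` (reward vector) follow standard MDP notation; the function itself is an illustration, not the paper's refinement algorithm.

```python
def krylov_basis(P, r, k):
    # Build the order-k Krylov basis {r, Pr, ..., P^(k-1) r} for an
    # MDP with transition matrix P (n x n, nested lists) and reward
    # vector r (length n). Each basis vector is one more application
    # of the transition dynamics to the reward.
    n = len(r)
    basis = [list(r)]
    v = list(r)
    for _ in range(k - 1):
        v = [sum(P[i][j] * v[j] for j in range(n)) for i in range(n)]
        basis.append(list(v))
    return basis
```

A value function is then approximated as a linear combination of these basis vectors, which is why refining the basis so that it "respects" the dynamics matters for approximation error.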
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.61)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)
14da15db887a4b50efe5c1bc66537089-AuthorFeedback.pdf
We would like to thank the reviewers for their insightful comments. Addressing the common point of limiting our experimentation to a single-decision setting: our intent was to focus the analysis solely on the effects of candidate generation. By removing the influence of other factors on search performance, such as rollout policies and state value function approximations, we can isolate the evaluation. We are aware that the sequential-decision setting requires extra reasoning. We would argue, though, that the other components of learning algorithms for search aim to reduce the amount of reasoning needed --- indeed, learning a perfect value function approximation would essentially reduce a sequential-decision problem to a single-decision problem. We do, however, plan to examine our ideas in a full MCTS setting, which we think is a problem deserving its own investigation.